Python Basics

Disclaimer - This document is meant only as a reference for workshop attendees. It does not cover all the concepts or implementation details discussed during the workshop. The content of this notebook was adapted from Python Basics Workshop and Interactive Python.

Python what?!

  • Python is a general purpose programming language. Other examples include Java, C, C++ etc.
  • It is known for its remarkable power coupled with readable syntax.
    • The syntax of a programming language is the set of rules that defines the combinations of symbols that are considered to be correctly structured pieces of code. Think: Spelling & Grammar.
  • It is widely regarded as one of the easier programming languages to learn, yet it is powerful enough to be used heavily in machine learning and data science.
  • Python code is executed line-by-line in a sequential fashion.

Basic Syntax Simplicity


In [12]:
print("This is an example of Python code.")
print()
a = int(input("Enter a value for a: "))
b = int(input("Enter a value for b: "))
print("The sum of a & b is: " + str(a + b))  # print the sum of a & b


This is an example of Python code.

Enter a value for a: 10
Enter a value for b: 20
The sum of a & b is: 30

Statement: Each line of code in a Python program is called a statement. Python interprets statements and runs them one by one.

Comments: The # symbol indicates a comment and anything after # is ignored by the computer. Comments provide information to improve code readability.

Built-in Functions: Python comes with many built-in functions like print() & input() to help you in writing your code.


Variables

What are variables?

Recall variables from algebra, e.g. x = 5. In programming, variables store data temporarily so that it can be referred to by name. The value a variable refers to can be updated as the program runs using the assignment operator, =.


In [13]:
a = 3       # simple assignment
a = a + 1   # RHS is evaluated first
print(a)    # 4


4

There are four basic data types in Python:

  1. Integer - Whole numbers. E.g. 1, 123, 89
  2. Float - Floating-point (decimal) numbers. E.g. 1.3, 82.4, 3.14159…
  3. String - Sequence of characters. E.g. "a", "abc", "abc def"
  4. Boolean - One of two possible values: True or False.

In [14]:
a = 5
b = 4.3
c = "Hello, world"
d = True

It is possible to convert a variable from one type to another if the conversion is compatible. For example, converting "123" (string) to 123 (integer) is possible but "hello" cannot be converted to an integer.


In [15]:
strA = "123"
intA = int(strA)      # 123
floatA = float(strA)  # 123.0
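
An incompatible conversion does not return a default value; it raises a ValueError. The following is a minimal sketch of one way to guard against that. The helper name to_int_or_none is hypothetical, and it relies on try/except, which this notebook does not otherwise cover.

```python
def to_int_or_none(text):
    """Return int(text), or None when text is not a valid integer literal."""
    try:
        return int(text)
    except ValueError:
        # int("hello") raises ValueError, so execution ends up here
        return None

print(to_int_or_none("123"))    # 123
print(to_int_or_none("hello"))  # None
```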

There are two main built-in numeric classes that implement the integer and floating point data types.
These Python classes are called int and float.

The standard arithmetic operators +, -, *, / and ** (exponentiation) can be combined with parentheses to override the normal operator precedence. Two other very useful operators are the remainder (modulo) operator, %, and integer division, //.

Note: when two integers are divided with /, the result is always a floating point number. The integer division operator // rounds the result down to the nearest integer, which for a negative result means rounding away from zero (for example, -5 // 4 is -2).


In [16]:
print(2+3*4)    #14
print((2+3)*4)  #20
print(2**10)    #1024
print(6/3)      #2.0
print(7/3)      #2.3333333333333335
print(7//3)     #2
print(-5//4)    #-2
print(7%3)      #1
print(3/6)      #0.5
print(3//6)     #0
print(3%6)      #3
print(2**100)   #1267650600228229401496703205376


14
20
1024
2.0
2.3333333333333335
2
-2
1
0.5
0
3
1267650600228229401496703205376
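
The rounding-down behaviour of // is easiest to see next to truncation. This short sketch (the variable names are only illustrative) compares the two for a negative dividend and checks the identity that ties // and % together.

```python
a, b = -5, 4

floor_quotient = a // b      # rounds toward negative infinity
truncated = int(a / b)       # int() drops the fractional part instead
remainder = a % b            # in Python, takes the sign of the divisor

print(floor_quotient)        # -2
print(truncated)             # -1
print(remainder)             # 3

# For integers, (a // b) * b + (a % b) reconstructs a
print(floor_quotient * b + remainder == a)   # True
```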

The boolean data type, implemented as the Python bool class, will be quite useful for representing truth values. The possible state values for a boolean object are True and False with the standard boolean operators, and, or, and not.


In [17]:
print(True)
print(False)
print(False or True)
print(not (False or True))
print(True and True)


True
False
True
False
True

Boolean data objects are also used as results for comparison operators such as equality == and greater than >. In addition, relational operators and logical operators can be combined together to form complex logical questions.

Operation Name          Operator   Explanation
less than               <          Less than operator
greater than            >          Greater than operator
less than or equal      <=         Less than or equal to operator
greater than or equal   >=         Greater than or equal to operator
equal                   ==         Equality operator
not equal               !=         Not equal operator
logical and             and        Both operands True for result to be True
logical or              or         One or the other operand is True for the result to be True
logical not             not        Negates the truth value, False becomes True, True becomes False

In [18]:
print(5==10)
print(10 > 5)
print((5 >= 1) and (5 <= 10))


False
True
True

Reassignment Operators

Because updating a variable based on its current value is so common, Python includes special operators for it:


In [19]:
manila_pop = 1780148
print(manila_pop)
manila_pop += 1675 # increase the value of manila_pop by 1675
manila_pop -= 250  # decrease the value of manila_pop by 250
manila_pop *= 0.9  # decimate manila_pop
manila_pop /=  2   # approximate the female population of Manila


1780148
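
Each of these operators is shorthand for a longer assignment. The sketch below uses a small made-up counter so the intermediate values are easy to follow. One consequence worth noting is that /= performs true division, so the variable holds a float afterwards, just as manila_pop does once it has been multiplied by 0.9.

```python
counter = 10
counter += 5     # same as counter = counter + 5
counter -= 3     # same as counter = counter - 3
counter *= 2     # same as counter = counter * 2
print(counter)   # 24

counter /= 4     # true division always produces a float
print(counter)   # 6.0
print(type(counter) is float)   # True
```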

Multiple assignment

It is also possible to assign two variables on a single line:


In [20]:
#These two assignments can be abbreviated
savings = 514.86
salary = 320.51

#Using multiple assignment
savings, salary = 514.86, 320.51
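
Because the whole right-hand side is evaluated before any name is assigned, multiple assignment also gives a compact way to swap two variables. A small illustration with made-up names:

```python
x, y = 1, 2
x, y = y, x       # the right-hand side is evaluated first, then unpacked
print(x, y)       # 2 1

# The number of names must match the number of values
low, mid, high = 0, 50, 100
print(low, mid, high)   # 0 50 100
```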

Printing Variables

In Python, we can print text onto the console using the print() function.


In [21]:
print("Hello, world")

a = 5
print(a)
print("The value of 'a' is: " + str(a))


Hello, world
5
The value of 'a' is: 5

Getting User Input

In Python, we can get input from the user and store this in variables using the input() function. The function prints whatever prompt is passed to it and waits for the user input. Once the Enter key is pressed, it stores the user input into the variable as a string.


In [22]:
name = input("Enter your name: ")
print(name)

age = int(input("Enter your age: "))
print(age)


Enter your name: Shantanu
Shantanu
Enter your age: 21
21

Formatting Output

We have already seen that the print function provides a very simple way to output values from a Python program. print takes zero or more parameters and displays them using a single blank as the default separator. It is possible to change the separator character by setting the sep argument. In addition, each print ends with a newline character by default. This behavior can be changed by setting the end argument.


In [23]:
print("Hello")

print("Hello","World")

print("Hello","World", sep="***")

print("Hello","World", end="***")
print("Next character will start after the ***")


Hello
Hello World
Hello***World
Hello World***Next character will start after the ***

It is often useful to have more control over the look of your output. Fortunately, Python provides us with an alternative called formatted strings. A formatted string is a template in which words or spaces that will remain constant are combined with placeholders for variables that will be inserted into the string.


In [24]:
print(name, "is", age, "years old.")
print("%s is %d years old." % (name, age))


Shantanu is 21 years old.
Shantanu is 21 years old.

The % operator is a string operator called the format operator. The left side of the expression holds the template or format string, and the right side holds a collection of values that will be substituted into the format string.

Note: the number of values in the collection on the right side must match the number of conversion specifiers (the % placeholders) in the format string. Values are taken in order, left to right, from the collection and inserted into the format string.

Character   Output Format
d, i        Integer
u           Unsigned integer
f           Floating point as m.ddddd
e           Floating point as m.ddddde+/-xx
E           Floating point as m.dddddE+/-xx
g           Use %e for exponents less than −4 or greater than +5, otherwise use %f
c           Single character
s           String, or any Python data object that can be converted to a string by using the str function.
%           Insert a literal % character

In addition to the format character, you can also include a format modifier between the % and the format character.
Format modifiers may be used to left-justify or right-justify the value within a specified field width.
Modifiers can also be used to specify the field width along with a number of digits after the decimal point.

Modifier   Example    Description
number     %20d       Put the value in a field width of 20
-          %-20d      Put the value in a field 20 characters wide, left-justified
+          %+20d      Put the value in a field 20 characters wide, right-justified, always showing the sign of a number
0          %020d      Put the value in a field 20 characters wide, fill in with leading zeros.
.          %20.2f     Put the value in a field 20 characters wide with 2 characters to the right of the decimal point.
(name)     %(name)d   Get the value from the supplied dictionary using name as the key.

In [25]:
price = 24
item = "banana"
print("The %s costs %d cents"%(item,price))
print("The %+10s costs %5.2f cents"%(item,price))
print("The %+10s costs %10.2f cents"%(item,price))
print()
itemdict = {"item":"banana","cost":24}
print("The %(item)s costs %(cost)7.1f cents"%itemdict)


The banana costs 24 cents
The     banana costs 24.00 cents
The     banana costs      24.00 cents

The banana costs    24.0 cents
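
To see the modifiers one at a time, the sketch below stores each formatted result in a variable and wraps it in square brackets so the padding is visible. The variable names are only illustrative.

```python
item = "banana"
price = 24

left = "[%-10s]" % item          # left-justified in 10 characters
right = "[%10s]" % item          # right-justified (the default)
padded = "[%05d]" % price        # zero-padded to 5 digits
signed = "[%+d]" % price         # + forces a sign on the number
money = "[%8.2f]" % price        # width 8, 2 digits after the point

print(left)    # [banana    ]
print(right)   # [    banana]
print(padded)  # [00024]
print(signed)  # [+24]
print(money)   # [   24.00]
```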

Python strings also include a format method that can be used in conjunction with a new Formatter class to implement complex string formatting. More about these features can be found in the Python library reference manual.


Arithmetic Operations

Variables can be used to perform simple arithmetic operations. For example:


In [26]:
a = 4
b = 5
c = a + b   # addition
d = a - b   # subtraction
e = a * b   # multiplication
f = a / b   # division
g = a % b   # modulus (remainder)

Strings

Strings represent a sequence of one or more characters. Each character can be accessed separately using a zero-based index.

H  e  l  l  o  
0  1  2  3  4

In [27]:
s = "Hello"
print(s[1])
print(len(s))
print(s + ", world")


e
5
Hello, world

Python comes with a lot of useful built-in methods for strings. Some common ones are listed below. You can find more at String Methods.

Method Name   Use                    Explanation
center        astring.center(w)      Returns a string centered in a field of size w
count         astring.count(item)    Returns the number of occurrences of item in the string
ljust         astring.ljust(w)       Returns a string left-justified in a field of size w
lower         astring.lower()        Returns a string in all lowercase
rjust         astring.rjust(w)       Returns a string right-justified in a field of size w
find          astring.find(item)     Returns the index of the first occurrence of item
split         astring.split(schar)   Splits a string into substrings at schar

In [28]:
print(s.lower())                   # returns lowercase version of string
print(s.upper())                   # returns uppercase version of string
print(s.find("abc"))               # searches the string for "abc" and returns the first index where it begins or -1
print(s.replace("old", "new"))     # returns a string with all occurrences of "old" replaced with "new"
print(s.center(10))                # returns the string centered in a 10 character space
print(s.split('e'))                # returns a list of the substrings obtained by splitting on each occurrence of 'e'
print(s.islower())
print(s.title())
print(s.count("l"))


hello
HELLO
-1
Hello
  Hello   
['H', 'llo']
False
Hello
2

If you want to access a part of a string, Python allows you to slice the string using start and end indices. The character at the start index is included; the character at the end index is not.


In [29]:
s = "hello"
print(s[1:4])   # "ell"
print(s[1:])    # "ello"
print(s[:3])    # "hel"
print(s[:])     # "hello"


ell
ello
hel
hello

A major difference between strings and lists (covered later in this notebook) is that lists can be modified while strings cannot. This property is referred to as mutability: lists are mutable; strings are immutable.
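
A minimal sketch of what that difference looks like in practice. It uses try/except to catch the error, which this notebook does not otherwise cover, and the variable names are only illustrative.

```python
letters = ["h", "e", "l", "l", "o"]
letters[0] = "j"                 # lists allow item assignment
print(letters)                   # ['j', 'e', 'l', 'l', 'o']

word = "hello"
try:
    word[0] = "j"                # strings do not
except TypeError as err:
    print("TypeError:", err)

# To "change" a string, build a new one instead
new_word = "j" + word[1:]
print(new_word)                  # jello
print(word)                      # hello (the original is untouched)
```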

String formatting

One particularly useful string method is format. The format method is used to construct strings by inserting values into template strings. Consider this example.


In [30]:
name = "Shantanu"
lang = "Python"
ver = 3.6
message = "Hi {}, Welcome to coding with {} v{}".format(name, lang, ver)
print(message)


Hi Shantanu, Welcome to coding with Python v3.6
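
The placeholders can do more than mark positions. The sketch below shows three variations that may be useful: numbered placeholders, named placeholders, and a format spec after a colon. The sample values are made up for illustration.

```python
name = "Shantanu"
price = 24

# Positional indexes let a value be reused or reordered
print("{0} likes {1}; {1} likes {0}".format(name, "Python"))

# Keyword placeholders make long templates easier to read
print("{who} is {years} years old".format(who=name, years=21))

# A format spec after the colon controls width, alignment and precision
line = "{:<10}|{:>8.2f}".format("banana", price)
print(line)   # banana    |   24.00
```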

Lists

Instead of variables holding individual values, it is also possible to create a list of values in Python. Each item in the list is called an element and can be accessed individually using a zero-based index. (For more information, see the Python Advanced notebook.)


In [31]:
listOfNums = [1, 2, 3, 4, 5]
print(listOfNums)
print(listOfNums[1])
print(len(listOfNums))


[1, 2, 3, 4, 5]
2
5

Tuples

Tuples are very similar to lists in that they are heterogeneous sequences of data. The difference is that a tuple is immutable, like a string. A tuple cannot be changed. Tuples are written as comma-delimited values enclosed in parentheses. As sequences, they can use any operation described above.


In [32]:
myTuple = (2,True,4.96)
print(myTuple)

print(len(myTuple))
print(myTuple[0])
print(myTuple * 3)
print(myTuple[0:2])


(2, True, 4.96)
3
2
(2, True, 4.96, 2, True, 4.96, 2, True, 4.96)
(2, True)

However, if you try to change an element in a tuple, you will get an error:


In [33]:
# Uncomment the next line to see the error (a TypeError).
# myTuple[1]=False
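
If a changed copy is needed, one option is to build a new tuple from slices of the old one. This is only a sketch; it catches the error with try/except so that the cell keeps running.

```python
myTuple = (2, True, 4.96)

try:
    myTuple[1] = False          # item assignment is not allowed on tuples
except TypeError as err:
    print("TypeError:", err)

# Build a new tuple from slices when a changed copy is needed
changed = myTuple[:1] + (False,) + myTuple[2:]
print(changed)                  # (2, False, 4.96)
print(myTuple)                  # (2, True, 4.96)
```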

Sets

A set is an unordered collection of zero or more immutable Python data objects. Sets do not allow duplicates and are written as comma-delimited values enclosed in curly braces. The empty set is represented by set(), because {} creates an empty dictionary. Sets are heterogeneous, and the collection can be assigned to a variable as below.


In [34]:
# empty_set = set()
mySet = {3,6,"cat",4.5,False}
print(mySet)
mySet = {"cat",3,3,False,3,3,6,6,6,6,"cat","cat",4.5,False}
print(mySet)


{False, 3, 'cat', 4.5, 6}
{False, 3, 'cat', 4.5, 6}

Even though sets are not considered to be sequential, they do support a few of the familiar operations presented earlier.

Operation Name   Operator           Explanation
membership       in                 Set membership
length           len                Returns the cardinality of the set
|                aset | otherset    Returns a new set with all elements from both sets
&                aset & otherset    Returns a new set with only those elements common to both sets
-                aset - otherset    Returns a new set with all items from the first set not in second
<=               aset <= otherset   Asks whether all elements of the first set are in the second
Sets support a number of methods that should be familiar to those who have worked with them in a mathematics setting.

Method Name    Use                           Explanation
union          aset.union(otherset)          Returns a new set with all elements from both sets
intersection   aset.intersection(otherset)   Returns a new set with only those elements common to both sets
difference     aset.difference(otherset)     Returns a new set with all items from first set not in second
issubset       aset.issubset(otherset)       Asks whether all elements of aset are in otherset
add            aset.add(item)                Adds item to the set
remove         aset.remove(item)             Removes item from the set
pop            aset.pop()                    Removes an arbitrary element from the set
clear          aset.clear()                  Removes all elements from the set
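
The operators and methods in the two tables can be tried out on a pair of small sets. The sets below are made up for illustration, and the results are passed through sorted() only so that the printed order is predictable, since sets themselves are unordered.

```python
evens = {2, 4, 6, 8}
small = {1, 2, 3, 4}

print(evens | small == evens.union(small))   # True
print(sorted(evens | small))     # [1, 2, 3, 4, 6, 8]
print(sorted(evens & small))     # [2, 4]
print(sorted(evens - small))     # [6, 8]
print({2, 4} <= evens)           # True
print(3 in evens)                # False

small.add(5)        # add a new element
small.add(5)        # adding it again has no effect
small.remove(1)     # remove raises KeyError if the item is missing
print(sorted(small))             # [2, 3, 4, 5]
print(len(small))                # 4
```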

Dictionaries

Dictionaries in Python allow you to store key-value pairs. Keys are unique within a dictionary while values may be repeated. For example, a dictionary might map names (keys) to ages (values). Because the keys are unique, they serve as the index used to access the values.


In [2]:
ages = {"Bob" : 21, "Jake" : 24, "John" : 23}
print(ages["Bob"])   # 21
print(ages["Jake"])  # 24

ages["Jane"] = 23
print(ages)          # {"Bob" : 21, "Jake" : 24, "John" : 23, "Jane" : 23}

ages["Bob"] = 24
print(ages)          # {"Bob" : 24, "Jake" : 24, "John" : 23, "Jane" : 23}

print(ages.keys())   # dict_keys(["Bob", "Jake", "John", "Jane"])
print(ages.values()) # dict_values([24, 24, 23, 23])

# Accessing element that doesn't exist gives KeyError
# print(ages["Rob"])


21
24
{'Bob': 21, 'Jake': 24, 'John': 23, 'Jane': 23}
{'Bob': 24, 'Jake': 24, 'John': 23, 'Jane': 23}
dict_keys(['Bob', 'Jake', 'John', 'Jane'])
dict_values([24, 24, 23, 23])

Handling the KeyError

Two common ways to avoid a KeyError are to test for the key with in, or to look it up with get(), which returns None when the key is missing. Examples below:


In [10]:
if "Bob" in ages :
    print("Bob is already in dictionary.")
else :
    print("Bob is not in dictionary.")


Bob is already in dictionary.

In [11]:
if ages.get("Rob") is None:
    print("Rob is not in dictionary.")
else :
    print("Rob is already in dictionary.")


Rob is not in dictionary.
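
The get() method also accepts an optional second argument that is returned in place of None when the key is missing. A short sketch, using 0 as an arbitrary fallback value:

```python
ages = {"Bob": 24, "Jake": 24, "John": 23, "Jane": 23}

# The second argument is used only when the key is missing
print(ages.get("Bob", 0))    # 24
print(ages.get("Rob", 0))    # 0

# The lookup above does not add the missing key
print("Rob" in ages)         # False
```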

Control Flow

While sequentially executing code can perform a lot of tasks for us, many tasks require some subtask to be repeated or selectively executed. For example, if we want to calculate the average age in a classroom:


In [36]:
total = 0
numOfStudents = 0

studentAge = int(input("What's your age? "))
total = total + studentAge
numOfStudents = numOfStudents + 1

studentAge = int(input("What's your age? "))
total = total + studentAge
numOfStudents = numOfStudents + 1

studentAge = int(input("What's your age? "))
total = total + studentAge
numOfStudents = numOfStudents + 1

average = total / numOfStudents
print(average)


What's your age? 12
What's your age? 11
What's your age? 10
11.0

Repeating the same lines of code is tedious and limits the reusability & maintainability of the code. For example, in the above code, if we just wanted to change the user prompt to Enter your age: instead, we would have to change many lines of code.


if Statements

if statements are one of the most important control flow tools available in programming. They allow the program to dynamically choose which lines of code to execute depending on a particular condition.

if <condition>:
    # Statements here are executed if condition is True
else:
    # Statements here are executed if condition is False

if <conditionA>:
    # Statements here are executed if conditionA is True
elif <conditionB>:
    # Statements here are executed if conditionA is False and conditionB is True
else:
    # Statements here are executed if conditionA and conditionB are False

A colon marks the beginning of a block and statements within each block are indented so that they can be distinguished from each other.
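
A concrete version of the if / elif / else template might look like the following. The temperature value and the thresholds are arbitrary and chosen only for illustration.

```python
temperature = 18

if temperature < 10:
    label = "cold"
elif temperature < 25:
    # reached only when the first condition was False
    label = "mild"
else:
    label = "hot"

print(label)   # mild
```

Only one of the three blocks runs. Changing temperature to 5 or 30 selects a different branch.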

Condition

Conditions are boolean expressions that evaluate to either True or False. They are made up of relational operators and logical operators.

Relational Operators

  1. ==: equal to
  2. !=: not equal to
  3. < : less than
  4. > : greater than
  5. <=: less than or equal to
  6. >=: greater than or equal to

if x == y:
    print("x & y are equal.")
else:
    print("x & y are not equal.")

Logical Operators

  1. boolExpr1 and boolExpr2: Both expressions have to be True for the entire expression to be True
  2. boolExpr1 or boolExpr2: At least one expression has to be True for the entire expression to be True
  3. not boolExpr: Inverts the truth value of the expression

if (x == y) and (y == z):
    print("x, y & z are equal.")

Loops

Loops in programming allow the same task to be executed repeatedly for a definite or indefinite number of times. They are made up of four main sections:

  1. Initialize: Initialize the loop control variable.
  2. Test: Continue the loop?
  3. Loop Body: Task being repeated.
  4. Update: Modify the loop control variable so that the next time we test, we may exit the loop.

There are two main kinds of loops:

  1. Counter Controlled: The number of repetitions of the loop is known before the loop body starts executing.
  2. Sentinel Controlled: The number of repetitions of the loop is unknown before the loop body starts executing.

while Loop

A while loop repeats its body as long as the given condition is True, i.e. until the condition becomes False.

while <condition>:
    # Loop Body

For example, to print all the integers from 1 - 10:


In [37]:
num = 1                 # Initialize
while num <= 10:        # Test
    print(num)          # Loop Body
    num = num + 1       # Update


1
2
3
4
5
6
7
8
9
10

An example of a sentinel controlled loop:


In [38]:
answer = 5
guess = int(input("Guess a number from 1 - 10: "))
while guess != answer:
    guess = int(input("Wrong! Guess a number from 1 - 10: "))
print("Correct!")


Guess a number from 1 - 10: 2
Wrong! Guess a number from 1 - 10: 3
Wrong! Guess a number from 1 - 10: 4
Wrong! Guess a number from 1 - 10: 5
Correct!

for Loop

A for loop can be used to iterate through a list and perform tasks accordingly.

for var in someList:
    # Loop Body

In every iteration of the loop, the var variable will be assigned the next value in the list. For example, the code below will print all the elements in the list in separate lines:


In [39]:
numList = [2, 1, 4, 5, 3]
for num in numList:
    print(num)


2
1
4
5
3
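
Returning to the average-age example from the start of this section, a for loop removes the repeated lines. This is a sketch that reads the ages from a list instead of calling input(), so the values below are simply the three entered in the earlier run.

```python
ages = [12, 11, 10]      # sample values taken from the run shown earlier

total = 0
numOfStudents = 0
for studentAge in ages:
    total = total + studentAge
    numOfStudents = numOfStudents + 1

average = total / numOfStudents
print(average)           # 11.0
```

The loop body is written once, so adding a fourth student only means adding a value to the list.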

A for loop can also be used to iterate through a dictionary:


In [40]:
phoneBook = {"Bob" : "555-555", "Jake" : "555-556", "John" : "555-565", "Jane" : "555-655"}
for name in phoneBook.keys():
    print(name + " : " + phoneBook[name])


Bob : 555-555
Jake : 555-556
John : 555-565
Jane : 555-655
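
When both the key and the value are needed, the items() method is an alternative to looking each value up by key. A minimal sketch with a shortened phone book:

```python
phoneBook = {"Bob": "555-555", "Jake": "555-556"}

# items() yields (key, value) pairs, which the loop unpacks into two names
for name, number in phoneBook.items():
    print(name + " : " + number)

# Looping over the dictionary itself visits its keys
for name in phoneBook:
    print(name)
```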

Alternatively, a for loop can be used like a counter-controlled while loop, with a definite number of repetitions, using the range() function. range() produces the sequence of integers from 0 up to but not including the passed argument. (In Python 3 it returns a range object rather than a list; wrap it in list() to see the values.)


In [41]:
for i in range(5):  # range(5) produces 0, 1, 2, 3, 4
    print(i)


0
1
2
3
4
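
range() also accepts a start value and a step. The sketch below wraps each call in list() so the generated values are visible, then uses range() for a counter-controlled sum.

```python
print(list(range(5)))          # [0, 1, 2, 3, 4]
print(list(range(2, 6)))       # [2, 3, 4, 5]
print(list(range(10, 0, -3)))  # [10, 7, 4, 1]

# Summing 1 through 10 with a counter-controlled loop
total = 0
for i in range(1, 11):
    total += i
print(total)                   # 55
```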

Functions

Functions allow you to divide your code into chunks that do specific subtasks, which makes your code more readable and reusable.

# Define the function
def functionName():
    # Function Body
    pass  # placeholder so that the empty body is valid Python

functionName()  # Call the function

Parameters & Arguments

Functions can be defined with parameters, which are essentially variables that are set when the function is called using arguments. This allows functions to be reused with different data values.


In [42]:
def sum(a, b):  # a & b are parameters
    c = a + b
    print(c)

sum(3, 2)       # 3 & 2 are arguments
sum(32, 65)     # 32 & 65 are arguments


5
97

Return Values

Functions can also return values back to where the function was called so that the result can be further used.

def sum(a, b):
    c = a + b
    return c

s = sum(1, 2) # s is set to the value of c from sum()
print(s)      # 3
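
Printing a value and returning it are easy to confuse. The two hypothetical functions below differ only in that respect: a function with no return statement hands back None.

```python
def show_sum(a, b):
    print(a + b)        # displays the result, but returns nothing

def get_sum(a, b):
    return a + b        # hands the result back to the caller

shown = show_sum(1, 2)      # prints 3
returned = get_sum(1, 2)    # prints nothing

print(shown)                # None
print(returned * 10)        # 30
```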

Documenting functions

One of the key advantages of functions is that they help break a program down into smaller chunks. This makes code easier to write, because the chunks are reusable, and easier to read, because functions give human-readable names to processes. A population density formula (population divided by land area) isn't complicated, but it is still harder to recognize than a precisely named function.

There is one further technique for making functions more readable: documentation strings (also called "docstrings"). Docstrings are a type of comment used to explain the purpose of a function and how it should be used. Here's a population_density function with a docstring.


In [1]:
def population_density(population, land_area):
    """Calculate the population density of an area.

    population: int. The population of the area
    land_area: int or float. This function is unit-agnostic, if you pass
               in values in terms of square km or square miles the
               function will return a density in those units.
    """
    return population / land_area

Docstrings are surrounded by triple quotes, """. The first line of the docstring is a brief explanation of the function's purpose. If you feel that this is sufficient documentation, you can end the docstring at this point; single-line docstrings are perfectly acceptable. If you think that the function is complicated enough to warrant a longer description, you can add a more thorough paragraph after the one-line summary.

The next element of a docstring is an explanation of the function's arguments. Here you list the arguments, state their purpose, and what types the arguments should be.

Each of these pieces of the docstring is optional, as is the docstring itself. Remember though, it's always easier to write code than to read it! If you can make your code easier for your collaborators (which includes future you!) to read, then you should.

You can read a more thorough explanation of docstring conventions at https://www.python.org/dev/peps/pep-0257/.
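
A docstring is more than a comment: Python stores the text on the function object, where tools can retrieve it. The sketch below uses a shortened, single-line docstring and made-up numbers to show this.

```python
def population_density(population, land_area):
    """Calculate the population density of an area."""
    return population / land_area

# The docstring is stored on the function object
print(population_density.__doc__)

# In an interactive session, help(population_density) shows the same
# text together with the function signature.

print(population_density(1000, 50))   # 20.0
```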


Default Arguments

We can specify default values for the arguments of a function. However, arguments with default values must be placed at the end of the argument list. To understand this useful flexibility that Python provides, let's look at the well-known star box problem and modify it.
Starbox problem: The function takes two arguments, width and height, that specify how many characters wide the box is and how many lines tall it is.


In [3]:
def box(width, height):
    """print a box made up of asterisks.

    width: width of box in characters, must be at least 2
    height: height of box in lines, must be at least 2
    """
    print('*' * width) # print top edge of box

    # print sides of box
    for _ in range(height-2):
        print('*' + " " * (width-2) + '*') 

    print('*' * width) # print bottom edge of box
    
box(5,5)


*****
*   *
*   *
*   *
*****

Let's now modify the problem so that the box can be drawn with any symbol, falling back to the asterisk if none is specified. Note that the function definition now includes symbol='*' in its parameter list. This means that if the argument is not provided, the default value is used.


In [5]:
def box(width, height, symbol='*'):
    """print a box made up of asterisks, or some other character.

    width: width of box in characters, must be at least 2
    height: height of box in lines, must be at least 2
    symbol: a single character string used to draw the box edges
    """
    print(symbol * width) # print top edge of box

    # print sides of box
    for _ in range(height-2):
        print(symbol + " " * (width-2) + symbol) 

    print(symbol * width) # print bottom edge of box

box(5,5)
box(5,5,'!')


*****
*   *
*   *
*   *
*****
!!!!!
!   !
!   !
!   !
!!!!!
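
Arguments that have defaults can also be passed by name, which helps when a function has more than one of them. The helper below, make_row, is hypothetical and not part of the star box solution; it returns a single side row so that the effect of each argument is easy to see.

```python
def make_row(width, symbol='*', filler=' '):
    """Return one side row of a box as a string (illustrative helper)."""
    return symbol + filler * (width - 2) + symbol

print(make_row(5))                                  # *   *
print(make_row(5, '#'))                             # #   #
print(make_row(5, filler='.'))                      # *...*
print(make_row(width=5, symbol='+', filler='-'))    # +---+
```

Passing filler by name skips over symbol, which keeps its default.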

Mutable Default Arguments

Default arguments are a helpful feature, but there is one situation where they can be surprisingly unhelpful. Using a mutable type (like a list or dictionary) as a default argument and then modifying that argument can lead to strange results. It's usually best to avoid using mutable default arguments. To see why, consider the example below.

Consider this function which adds items to a todo list. Users can provide their own todo list, or add items to a default list:


In [5]:
def todo_list(new_task, base_list=['Wake up']):
    base_list.append(new_task)
    return base_list
print(todo_list('Brush teeth'))

print(todo_list('Snooze alarm and go back to sleep'))


['Wake up', 'Brush teeth']
['Wake up', 'Brush teeth', 'Snooze alarm and go back to sleep']

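In the above case, you can see that the list is not re-initialised to its original value on the second call. The default list is created once, when the function is defined, and every call that omits base_list shares and modifies that same object. This can prove troublesome in most cases.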
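
A common way around this is to use None as the default and create the list inside the function. The version below is a sketch of that pattern applied to the same todo_list function.

```python
def todo_list(new_task, base_list=None):
    # Create a fresh list on every call that omits base_list
    if base_list is None:
        base_list = ['Wake up']
    base_list.append(new_task)
    return base_list

print(todo_list('Brush teeth'))
# ['Wake up', 'Brush teeth']
print(todo_list('Snooze alarm and go back to sleep'))
# ['Wake up', 'Snooze alarm and go back to sleep']
```

Each call now starts from its own copy of the default, so earlier calls no longer leak into later ones.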